Quasi-experiments in epidemiology

Lee Kennedy-Shaffer, PhD

2025-06-10

How Will I Know?

Image of Whitney Houston's How Will I Know album cover

About Me

  • Assistant Professor at Yale School of Public Health

  • Teach statistical modeling and study design

  • Research focus on infectious disease study design and cluster-randomized trials

Image of presenter—32-year-old white man with dark hair—holding a water pump

Rise of Quasi-Experiments

Citation for Nobel Memorial Prize in Economic Sciences from 2021

Text from the final paragraph of popular press release for Nobel Memorial Prize in Economic Sciences from 2021

QEs in Economics and Political Science

Title and abstract for Card and Krueger (1994)

Title and abstract for Abadie and Gardeazabal (2003)

Title for Abadie et al. (2010)

QEs in Epidemiology and Public Health

Title of Craig et al. (2017)

Title of Wing et al. (2018)

Title and citation of Waddington et al. (2017)

Citation for Matthay and Glymour (2022)

Considering the Role of Evidence

Title of Nianogo et al. (2023)

Title of Matthay et al. (2020)

  • Allows use of routinely-collected data

  • Evaluates interventions in-context

  • Provides “real world evidence”/population impact

  • Answers questions randomized trials and observational studies cannot

  • But … has threats to internal and external validity

Workshop Details

Workshop Plan: Part I

8:30–9:00 Introduction to difference-in-differences

9:00–9:45 Advanced DID and staggered adoption

9:45–10:30 Analysis 1: Advanced DID of COVID-19 vaccine mandates

Workshop Plan: Part II

10:40–11:15 Introduction to synthetic control

11:15–11:45 Analysis 3: SC of Ohio’s COVID-19 vaccine lottery

11:45–12:15 Advanced SC methods

12:15–12:30 Analysis 4: Advanced SC of multiple states’ COVID-19 vaccine lotteries

Workshop Goals

  • Understand, interpret, and critique the use of DID and SC in epidemiology

  • Gain familiarity with state-of-the-art methods related to DID and SC and identify resources for further exploration

  • Contextualize the assumptions needed for causal inference from quasi-experiments

  • Implement staggered adoption DID and SC analyses and diagnostics/inference in R

A Note on the Examples

I will focus here on infectious disease (ID) examples from the published literature with available data. Some issues are specific to ID and others are not, but all of them illustrate how to approach these questions.

Title of Lopez Bernal et al. (2019)

Title for Kennedy-Shaffer (2024)

Title for Goodman-Bacon and Marcus (2020)

Title for Feng and Bilinski (2024)

Let’s Get To It!

All materials: https://github.com/leekshaffer/Epi-QEs/

QR code for above link

Standard Difference-in-Differences

Motivating Example: Cholera, London, 1850s

Map of water service areas in 1850s London

Citation information for Coleman (2019)

Title of Caniglia and Murray (2020)

Setting

  • Two (or more) units: some treated/exposed, some untreated

  • Two (or more) time periods: some prior to first treatment, some after

Example: South London “Grand Experiment” from Coleman 2024

Untreated: Southwark & Vauxhall Districts (12)

Treated: Joint Southwark & Vauxhall/Lambeth Districts (16)

Time Periods: 1849 (pre-treatment) and 1854 (post-treatment) outbreaks

Potential Outcomes and Treatment Effect

| Unit | Pre-Treatment | Post-Treatment |
|---|---|---|
| Exposed | \(Y_{10} = Y_{10}(0)\) | \(Y_{11} = Y_{11}(1)\) |
| Unexposed | \(Y_{00} = Y_{00}(0)\) | \(Y_{01} = Y_{01}(0)\) |

Treatment Effect:

\[\theta = E[Y_{11}(1) - Y_{11}(0)]\]

Change Over Time

Within each unit, we have an interrupted time series:

\[ \begin{aligned} \Delta_1 &= Y_{11} - Y_{10} \\ \Delta_0 &= Y_{01} - Y_{00} \end{aligned} \]

Key Idea

Use the observed change \(\Delta_0\) in the untreated unit as a stand-in for the unobserved change the treated unit would have experienced without treatment, \(Y_{11}(0) - Y_{10}\).

Two-by-Two DID

\[ \begin{aligned} \hat{Y}_{11}(1) &= Y_{11} \\ \hat{Y}_{11}(0) &= Y_{10} + \color{darkgreen}{(Y_{01} - Y_{00})} \\ \hat{\theta} &= \color{purple}{(Y_{11} - Y_{10})} - \color{darkgreen}{(Y_{01} - Y_{00})} \\ \end{aligned} \]

Two-by-Two DID: Example

| Supplier | Sub-Districts | 1849 Deaths per 10,000 | 1854 Deaths per 10,000 |
|---|---|---|---|
| Joint Southwark & Vauxhall/Lambeth (Treated) | 16 | 130.1 | 84.9 |
| Southwark & Vauxhall Only (Untreated) | 12 | 134.9 | 146.6 |

| Supplier | 1849 Deaths per 10,000 | 1854 Deaths per 10,000 | Diff, 1854-1849 |
|---|---|---|---|
| Joint Southwark & Vauxhall/Lambeth (Treated) | 130.1 | 84.9 | -45.2 |
| Southwark & Vauxhall Only (Untreated) | 134.9 | 146.6 | 11.8 |
| Diff, Treated-Untreated | -4.8 | -61.8 | -57.0 |
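
The arithmetic in the table can be reproduced in a few lines. This is a minimal Python sketch (the workshop analyses themselves are in R), and the `rates` dictionary and `did_2x2` helper are illustrative names, not part of the workshop materials. It uses the rounded rates shown above, so it gives roughly 11.7 and -56.9 where the table reports 11.8 and -57.0; the likely reason, which is an assumption here, is that the table was computed from unrounded rates.

```python
# Cholera death rates per 10,000, copied from the table above (rounded values).
rates = {
    ("treated", 1849): 130.1,
    ("treated", 1854): 84.9,
    ("untreated", 1849): 134.9,
    ("untreated", 1854): 146.6,
}


def did_2x2(rates):
    """Return (change in treated, change in untreated, DID estimate)."""
    delta_1 = rates[("treated", 1854)] - rates[("treated", 1849)]
    delta_0 = rates[("untreated", 1854)] - rates[("untreated", 1849)]
    return delta_1, delta_0, delta_1 - delta_0


delta_1, delta_0, theta_hat = did_2x2(rates)
print(f"Change in treated:   {delta_1:.1f}")
print(f"Change in untreated: {delta_0:.1f}")
print(f"DID estimate:        {theta_hat:.1f}")
```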

Two-by-Two DID: Graphically

Controlled ITS

Compare to other possible estimates of \(\hat{Y}_{11}(0)\):

  • \(Y_{10}\): assumes no time trends

  • \(Y_{01}\): assumes no differences in units

  • Modeled trend for \(Y_{1t}\) over time: requires time model and more data

  • Regress \(Y_{1t}\) on \(Y_{0t}\): requires covariates, additional control units, and/or specific model
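
To make the comparison concrete, the first two alternatives can be set beside the DID counterfactual using the rounded cholera rates from the earlier example. This is a minimal Python sketch with illustrative variable names; the modeled-trend and regression alternatives are omitted because they need more data than the two-by-two table provides.

```python
# Rounded deaths per 10,000 from the South London example.
y10, y11 = 130.1, 84.9   # treated unit: 1849, 1854
y00, y01 = 134.9, 146.6  # untreated unit: 1849, 1854

# Candidate estimates of the untreated potential outcome Y11(0).
counterfactuals = {
    "before-after (Y10)": y10,                   # assumes no time trends
    "cross-sectional (Y01)": y01,                # assumes no differences in units
    "DID (Y10 + Y01 - Y00)": y10 + (y01 - y00),  # assumes parallel trends
}

# Implied effect estimate: observed Y11 minus each counterfactual.
effects = {name: y11 - y_cf for name, y_cf in counterfactuals.items()}

for name, y_cf in counterfactuals.items():
    print(f"{name:24s} Y11(0) estimate = {y_cf:6.1f}, effect = {effects[name]:6.1f}")
```

The three estimates differ only in what they assume about the missing counterfactual, which is why the choice among them is a causal judgment rather than a statistical one.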

Details and Assumptions

Regression Formulation

\[ Y_{it} = \alpha_i + \gamma_t + \theta I(X_{it} = 1)+\epsilon_{it}, \]

where:

  • \(\alpha_i\) is the fixed effect for unit \(i\),

  • \(\gamma_t\) is the fixed effect for time \(t\),

  • \(\epsilon_{it}\) is the error term for unit \(i\) in time \(t\), and

  • \(X_{it}\) is the indicator of whether unit \(i\) is treated at time \(t\).

Note

This is called the two-way fixed effects (TWFE) model for DID.
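
With two units and two periods, the TWFE coefficient \(\theta\) coincides with the two-by-two DID estimate. The following is a minimal Python sketch of that equivalence (the workshop itself uses R). It assumes the rounded cholera rates from the earlier example and fits the model by ordinary least squares with NumPy; the variable names are illustrative.

```python
import numpy as np

# One row per unit-period: (unit, period, outcome).
# Unit 1 is the treated unit and is treated only in period 1.
rows = [
    (0, 0, 134.9),
    (0, 1, 146.6),
    (1, 0, 130.1),
    (1, 1, 84.9),
]
unit = np.array([r[0] for r in rows])
period = np.array([r[1] for r in rows])
y = np.array([r[2] for r in rows])
x = (unit == 1) & (period == 1)  # treatment indicator X_it

# Design matrix: intercept, unit dummy (alpha_i), period dummy (gamma_t), X_it.
design = np.column_stack([np.ones(len(y)), unit == 1, period == 1, x]).astype(float)
coef, *_ = np.linalg.lstsq(design, y, rcond=None)

theta_twfe = coef[3]
theta_did = (84.9 - 130.1) - (146.6 - 134.9)
print(f"TWFE theta: {theta_twfe:.3f}")
print(f"2x2 DID:    {theta_did:.3f}")
```

With only four observations the model is saturated, so the fit is exact and no standard error is available; inference needs multiple units or periods.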

Statistical Inference

Inference can be conducted using the TWFE regression model. This accounts for variability in the outcome if there are multiple treated/untreated units and multiple periods.

Standard errors are generally clustered by unit to account for within-unit correlation over time. Block-bootstrap variance estimation, which resamples whole units, is an alternative.

Caution

This accounts for statistical uncertainty but not causal uncertainty in the model assumptions. Those cannot be fully assessed statistically.
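
A block bootstrap for a two-period DID can be sketched as follows. This is a hedged Python illustration (the workshop uses R) on simulated data: the panel, the effect size of -50, and the noise levels are all made up for the example and are not the cholera data. Resampling is done within each arm, which is one reasonable design choice among several.

```python
import numpy as np

rng = np.random.default_rng(2025)

# Hypothetical two-period panel: 16 treated and 12 untreated units (simulated).
n_treated, n_untreated = 16, 12
true_effect = -50.0
treated = np.r_[np.ones(n_treated, dtype=bool), np.zeros(n_untreated, dtype=bool)]
unit_level = rng.normal(130.0, 20.0, size=treated.size)  # unit effects alpha_i
y_pre = unit_level + rng.normal(0.0, 10.0, size=treated.size)
y_post = unit_level + 10.0 + true_effect * treated + rng.normal(0.0, 10.0, size=treated.size)


def did(y_pre, y_post, treated):
    """Mean change in treated units minus mean change in untreated units."""
    change = y_post - y_pre
    return change[treated].mean() - change[~treated].mean()


def block_bootstrap(y_pre, y_post, treated, n_boot=2000):
    """Resample whole units (both periods together), separately within each arm."""
    idx_t = np.flatnonzero(treated)
    idx_u = np.flatnonzero(~treated)
    draws = np.empty(n_boot)
    for b in range(n_boot):
        sample = np.r_[rng.choice(idx_t, idx_t.size), rng.choice(idx_u, idx_u.size)]
        draws[b] = did(y_pre[sample], y_post[sample], treated[sample])
    return draws


theta_hat = did(y_pre, y_post, treated)
draws = block_bootstrap(y_pre, y_post, treated)
se = draws.std(ddof=1)
ci_low, ci_high = np.percentile(draws, [2.5, 97.5])
print(f"DID estimate: {theta_hat:.1f}, bootstrap SE: {se:.1f}")
print(f"95% percentile interval: ({ci_low:.1f}, {ci_high:.1f})")
```

Keeping each unit's pre and post outcomes together is what makes this a block bootstrap: it preserves the within-unit correlation that clustered standard errors are meant to handle.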

Key Assumptions

  • Parallel trends (in expectation of potential outcomes):

    \[ E[\color{purple}{Y_{11}(0) - Y_{10}(0)}] = E[\color{darkgreen}{Y_{01}(0) - Y_{00}(0)}] \]

  • No spillover

  • No anticipation/clear time point for treatment
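
A short derivation, using only the symbols defined above, shows why these assumptions suffice. No anticipation gives \(Y_{10} = Y_{10}(0)\), and no spillover gives \(Y_{01} = Y_{01}(0)\) and \(Y_{00} = Y_{00}(0)\). Adding and subtracting \(Y_{11}(0)\) then isolates the parallel trends term:

\[ \begin{aligned} E[\hat{\theta}] &= E[Y_{11} - Y_{10}] - E[Y_{01} - Y_{00}] \\ &= E[Y_{11}(1) - Y_{10}(0)] - E[Y_{01}(0) - Y_{00}(0)] \\ &= E[Y_{11}(1) - Y_{11}(0)] + \left\{ E[Y_{11}(0) - Y_{10}(0)] - E[Y_{01}(0) - Y_{00}(0)] \right\} \\ &= \theta + 0 = \theta \end{aligned} \]

The term in braces is zero exactly when parallel trends holds. If it does not, that term is the bias of the DID estimator.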

Improving Assumptions: Re-scale

Changing the scale of the outcome changes the parallel trends assumption. The most common transformation is to use the natural log.

E.g., \(\log(Y_{it}) = \alpha_i + \gamma_t + \theta I(X_{it}=1) + \epsilon_{it}\)

Changes parallel trends assumption to:

\[ \begin{aligned} E[\color{purple}{\log Y_{11}(0) - \log Y_{10}(0)}] &= E[\color{darkgreen}{\log Y_{01}(0) - \log Y_{00}(0)}] \\ E \left[ \log \left( \color{purple}{\frac{Y_{11}(0)}{Y_{10}(0)}} \right) \right] &= E \left[ \log \left( \color{darkgreen}{\frac{Y_{01}(0)}{Y_{00}(0)}} \right) \right] \end{aligned} \]
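
The effect of re-scaling can be seen on the rounded cholera rates from the earlier example. This is a minimal Python sketch (the workshop uses R) with illustrative variable names. On the log scale, \(\hat{\theta}\) is a difference of log ratios, so exponentiating it gives a ratio of rate ratios, and the implied counterfactual for the treated unit is multiplicative rather than additive.

```python
import math

# Rounded deaths per 10,000 from the South London example.
y10, y11 = 130.1, 84.9   # treated unit: 1849, 1854
y00, y01 = 134.9, 146.6  # untreated unit: 1849, 1854

# DID on the original (additive) scale.
theta_additive = (y11 - y10) - (y01 - y00)

# DID on the log scale, and its exponentiated form (ratio of rate ratios).
theta_log = (math.log(y11) - math.log(y10)) - (math.log(y01) - math.log(y00))
ratio_of_ratios = math.exp(theta_log)

# Implied untreated outcome for the treated unit in 1854 under each assumption.
cf_additive = y10 + (y01 - y00)        # same absolute change as the untreated unit
cf_multiplicative = y10 * (y01 / y00)  # same relative change as the untreated unit

print(f"Additive DID:      {theta_additive:.1f} deaths per 10,000")
print(f"Log-scale DID:     {theta_log:.3f} (ratio of ratios {ratio_of_ratios:.3f})")
print(f"Counterfactual Y11(0): additive {cf_additive:.1f}, multiplicative {cf_multiplicative:.1f}")
```

The two counterfactuals differ whenever the baseline rates differ, which is why parallel trends can hold on at most one of the two scales in that case.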

Improving Assumptions: Covariates

Incorporating covariates makes the parallel trends assumption conditional on those covariates.

E.g., \(Y_{it} = \alpha_i + \gamma_t + \theta I(X_{it}=1) + \beta Z_{i} + \epsilon_{it}\)

Changes parallel trends assumption to:

\[ E[\color{purple}{Y_{11}(0) - Y_{10}(0)} ~ | ~ Z_1] = E[\color{darkgreen}{Y_{01}(0) - Y_{00}(0)} ~ | ~ Z_0] \]

Caution

  • This makes the parallel trends assumption more complex to consider and requires modeling covariates

  • This changes the estimand and assumes the effect is homogeneous across covariates

    See Caetano and Callaway (2023) for issues that arise with time-varying covariates.

Epidemiologic Considerations

ATT Estimand

Estimand Interpretation

DID estimates the Average Treatment Effect on the Treated (ATT).

This may not be generalizable to other units, including the untreated units in the study.

Internal vs. External Validity

  • Internal validity may be high if the assumptions are justified.

  • External validity may be low because of limited transportability of the ATT and limited information on effect heterogeneity.

Bias vs. Variance

  • Incorporating additional units/periods can reduce variance, but may also risk violating the assumptions

  • Generally conducted with limited, carefully-selected units: low bias but high variance

Examples

  • More distant vs. closer untreated units

  • Incorporating more untreated units

  • Incorporating more recent time periods

Summary: DID for Epidemiology

Advantages:

  • Simple to implement

  • Uses summary data

  • No need to model time trends or collect covariates

  • Straightforward interpretation

Disadvantages/Limitations:

  • Targets ATT not ATE

  • Need to justify key assumptions

  • Requires careful selection of controls

  • Limited inference with few units/periods

Questions